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(D Abstract 

The Reduced-Basis Control- Variate Monte-Carlo method was introduced recently in [S. Boyaval and T. 
Lelievre, CMS, 8 2010] as an improved Monte-Carlo method, for the fast estimation of many parametrized 
expected values at many parameter values. We provide here a more complete analysis of the method in- 
y-^ eluding precise error estimates and convergence results. We also numerically demonstrate that it can be 
. useful to some parametrized frameworks in Uncertainty Quantification, in particular (i) the case where the 
parametrized expectation is a scalar output of the solution to a Partial Differential Equation (PDE) with 
^ stochastic coefficients (an Uncertainty Propagation problem), and (ii) the case where the parametrized ex- 
I ^ i pectation is the Bayesian estimator of a scalar output in a similar PDE context. Moreover, in each case, a 
PDE has to be solved many times for many values of its coefficients. This is costly and we also use a re- 
^ duced basis of PDE solutions like in [S. Boyaval, C. Le Bris, Nguyen C, Y. Maday and T. Patera, CMAME, 
198 2009]. This is the first combination of various Reduced-Basis ideas to our knowledge, here with a view 
to reducing as much as possible the computational cost of a simple approach to Uncertainty Quantification. 

Keywords: Monte-Carlo method, Variance reduction. Control variate. Reduced Basis method. Partial 
Differential Equations with stochastic coefficients. Uncertainty Quantification, Bayesian estimation . 
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1. Introduction Uncertainty Quantification (UQ) too, possibly in 

combination with the standard RB method in a 



The Reduced-Basis (RB) control-variate „„„ ^ ^ 
rS ^ ' PDE context. 



^ Monte-Carlo (MC) method was recently intro- 
duced in 1 1 1 to compute fast many expectations There is a huge literature on the UQ subject, 
of scalar outputs of the solutions to parametrized Indeed, to be actually predictive in real-life situ- 
ordinary Stochastic Diff'erential Equations (SDEs) ations ||2l, most numerical models require (i) to 
at many parameter values. But as a simple, calibrate as much as possible the parameters and 
generic MC method with reduced variance, the (ii) to quantify the remaining uncertainties propa- 
RB control-variate MC method can also be useful gated by the model. Besides, the latter two steps 
in other parametric contexts and the main goal are complementary in an iterative procedure to im- 
of this article is to show that it can be useful to prove numerical models using datas from experi- 
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ments: quantifying the variations of outputs gen- 
erated by input parameters allows one to calibrate 
the input uncertainties with data and in turn re- 
duces the epistemic uncertainty in outputs despite 
irreducible aleatoric uncertainty. Various numer- 
ical techniques have been developped to quantify 
uncertainties and have sometimes been used for 
years ll3]|4|. But there are still a number of chal- 
lenges ElgllTl. 

For PDEs in particular, the coefficients are typ- 
ical sources of uncertainties. One common mod- 
elling of these uncertainties endows the coeffi- 
cients with a probability distribution that presum- 
ably belongs to some parametric family and the 
PDEs solutions inherit the random nature of the 
uncertainty sources. A Bayesian approach is often 
favoured to calibrate the parameters in the proba- 
bility law using observations of the reality |]8] |9l. 
But the accurate numerical simulation of the PDEs 
solutions as a function of parametrized uncertain 
coefficients is a computational challenge due to its 
complexity, and even more so is the numerical op- 
timization of the parameters in uncertain models. 
That is why new/improved techniques are still be- 
ing investigated lITOl [TT] . Our goal in this work 
is to develop a practically useful numerical ap- 
proach that bases on the simple MC method to 
simulate the probability law of the uncertain coef- 
ficients. We suggest to use the RB control-variate 
MC method in some UQ frameworks to improve 
the computational cost of the naive MC method, in 
particular in contexts where some coefficients in a 
PDE are uncertain and other are controlled. 

There exist various numerical approaches to 
UQ. The computational cost of MC methods is 
certainly not optimal when the random PDE solu- 
tion is regular, see e.g. fT2\. But we focus here on 
MC methods because they are (a) very robust, that 
is useful when the regularity of the solution with 



respect to the random variables degrades, and (b) 
very easy to implement (they are non-intrusive in 
the sense that they can use a PDE numerical solver 
as a black-box, with the values of the PDE coef- 
ficients as input and that of the discrete solution 
as output). Besides, note that even when the ran- 
dom PDE solution is very regular with respect to 
the random variables, it is not yet obvious how al- 
gorithms can take optimal profit of the regularity 
of random PDE solutions and remain practically 
efficient as the dimension of the (parametric) prob- 
ability space increases, see e.g. [13]. So, focusing 
on a MC approach, our numerical challenge here 
is basically two-sided: (i) on the probabilistic side, 
one should sample fast the statistics of the random 
PDE solution (or of some random output that is 
the quantity of interest), and (ii) on the determin- 
istic side, one has to compute fast the solution to 
a PDE for many realizations of the random coef- 
ficients. It was proposed in lil4J to use the RB 
method in order to reduce the numerical complex- 
ity of (ii), but this does not fully answer the nu- 
merical challenge. In particular, although the RB 
method can improve naive MC approaches at no- 
cost (since the selection of the reduced basis for the 
PDE solutions at various coefficients values can be 
trained on the same large sample of coefficients 
values that is necessary to the MC sampling), the 
resulting MC approach might still be very costly, 
maybe prohibitively, due to the large number of 
realizations that is necessary to accurately sample 
the statistics of the PDE solution (side (i) of our 
challenge above). In this work, we thus tackle the 
question how to reduce the numerical complexity 
of (i). We have in mind the particular but useful 
case where one is interested in the expected value 
of a random scalar output of the random PDE so- 
lution as a function of a (deterministic) control pa- 
rameter, typically another (deterministic) coeffi- 
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cient in the UQ problem which is "under control". 
(Think of the construction of response surfaces for 
a mean value as a function of control parameters.) 
A similar parametric context occurs in Bayesian 
estimation, sometimes by additionally varying the 
hyper parameters or the observations. In any case, 
our goal is to reduce the computational cost of a 
parametrized (scalar) MC estimation when the lat- 
ter has to be done many times for many values 
of a parameter, and we illustrate it with examples 
meaningful in a UQ context. 

To accelerate the convergence of MC methods 
as numerical quadratures for the expectation of a 
random variable, one idea is to tune the sampling 
for a given family of random variables Uke in the 
quasi-Monte-Carlo (qMC) methods Ill5l [161 [TTl. 
Another common idea is to sample another ran- 
dom variable with same mean but with a smaller 
variance. Reducing the variance allows one to 
take smaller MC samples of realizations and yet 
get MC estimations with confidence intervals of 
similar (asymptotic) probability. Many techniques 
have been designed in order to reduce the vari- 
ance in various contexts ifTSl [T9l . Our RB control- 
variate MC method bases on the so-called control- 
variate technique. It has a specific scope of appli- 
cation in parametric contexts. But it suits very well 
to some computational problems in mathematical 
finance and molecular dynamics as shown in |[T], 
and can be useful in UQ as we are going to see. 

The paper is organized as follows. In Section|2] 
we recall the RB control-variate technique as a 
general variance reduction tool for the MC approx- 
imation of a parametrized expected value at many 
values of the parameter. The presentation is a bit 
different to that in [IJ, which was more focused 
on SDEs. Moreover, we also give new error esti- 
mates and convergence results. In Section [3] the 
RB control-variate MC method is applied to com- 



pute the mean of a random scalar output in a model 
PDE with stochastic coefficients (the input uncer- 
tainty) at many values of a control parameter. In 
Section]?! it is applied to Bayes estimation, first for 
a toy model where various parametric contexts are 
easily discussed, then for the same random PDE as 
in section |3] 

We also note that this work does not only im- 
prove on the RB approach to UQ but also 
on an RB approach to Bayesian estimation pro- 
posed in with a deterministic quadrature for- 
mula to evaluate integrals. For both applications, 
to our knowledge, our work is the first attempt at 
optimally approximating the solution with a sim- 
ple MC/FE method by combining RB ideas of two 
kinds, stochastic and deterministic ones f2TT|. Note 
that for convenience of the reader non-expert in 
RB methods, the standard RB method Il22ll23]| is 
briefly recalled in Section [33] 

2. The RB Control- Variate MC Method 

The RB control-variate technique is a generic 
variance reduction tool for the MC approxima- 
tion of a parametrized expected value at many 
values of the parameter. In this section we re- 
call the technique for the expectation E{Z^) of a 
generic square-integrable random variable Z'* e 
parametrized by A. The principle for the reduction 
of computations is based on the same paradigm as 
the standard RB method and allows one to acceler- 
ate the MC computations of many E{Z'^) at many 
values of A. Our presentation is slightly different 
than the initial one in and gives new elements of 
analysis (error estimates and convergence results). 

2.1. Principles ofRB control-variate MC method 

Let P be a probability measure such that Z ' is 
a random variable in Lp for all parameter values A 
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in a given fixed range A. Assume one has an al- 
gorithm to simulate the law of whatever /I e A. 
Then, at any /I e A, one can define MC estima- 
tors EuiZ^) that provide useful approximations of 
E{Z^), by virtue of the strong law of large numbers 



1 ^ 



M— >t)o 



^ £(Z ) , 



(2.1) 



provided the number M of independent identically 
distributed (i.i.d.) random variables Z,"^, ~ Z'', 
m- 1 ... M, is sufficiently large. Here, the idea is: 
if E{Z'^'i) :- J Z'^'idP is akeady known with a good 
precision for / parameter values A', i = 1 . . . /, 
(/ e N>o) and if the law of Z'' depends smoothly 
on A then, given / well-chosen real values af, 
i - 1 . . . /, the standard MC estimator Ef^iZ"^) 
could be efficiently replaced by a MC estimator for 
E{Z^ - 2ii a']Z^') + Yli^i afE(Z'^') that is as ac- 
curate and uses much less than M copies of Z '. 
In other words, if the random variable 

F-* = ^ af(Z'^'^ - £(Z''')) (2.2) 

is correlated with Z'' such that the control of Z'^ by 
F'' reduces the variance, that is if 



V{Z^) 



:=/|Z. 



E(Z'*)fdP> V{Z^ -Y^), 



then the confidence intervals with asymptotic 
probability erf j (a > 0) for MC estimations 



" Y;„ P-a.s. 



m=\ 



M M- 



E(Z'^) (2.3) 



that are in the Central Limit Theorem (CLT) 



(\EMiZ'-P)-EiZ^)\ ^ 
< a 



yjViZ-^ - Y'^)/M 



(2.4) 

converge faster with respect to the number M of 
realizations than the confidence intervals for 



The unbiased estimator ( |2.3[ ) (£(?') = 0) is thus a 
better candidate than ( |2.1[ ) for a fast MC method. 

At a given A e A, we define af, i = 1 . . . /, 
in ( |2.2| i to obtain the optimal variance reduction 
minimizing V{Z'^ - Y^). Equivalently, the control 
variates of the form ( |2.2| i are thus defined as best 
approximations of the ideal control variate Y'^ := 
Z'^ - E(Z^) € that reduces the variance to zero 



inf E 



i=l 



2\ 



(2.5) 



The af, i - 1 . . . /, thus solve a least-squares prob- 
lem with normal equations for ; = 1 . . . / 

Y,c(z'',z'')a]^c{z'',Z'), (2.6) 

where C{Y,Z) = E{{Y - E{Y)){Z - E(Z))) denotes 
the covariance between F,Z e L^. And they are 
unique as long as the matrix C with entries C,j = 
c(Z''',Z'^i) remains definite. 

In general, the computation of a good control 
variate is difficult (the ideal one Y'^ requires the re- 
sult E{Z'^) itself). But we proved in [IJ that control 
variates of the form \2.2\ actually make sense in 
some contexts, a result inspired by ll24l |251 . Let 
us denote by L^q the Hilbert subspace of random 
variables Y e that are centered E(Y) - 0. 

Proposition 1. Consider a set of random variables 
J 

Y'' ^YjgM^Yj^'^'^^^' (2.7) 

where Yj e Lpy, j = 1 . . .J, are uncorrelated and 
{gj)\<j<j are C^g(M) functions. If there exists a 
constant C > and an interval A such that, for 
all parameter ranges A = [/Imin, ^^max] C ]R, there 
exists a C°° dijfeomorphism ta : A — » A where 



sup s.\xp {gjOT-^i^\A)<M\C'" , (2.8) 

\<i<J AeTt^(A) 
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for all M-derivatives (gj o r^')**'' of gj o r^', 
then there exist constants Ci,C2 > independent 
of A and J such that, for all parameter ranges 
A = ^^^J c E, /or all I eN, I > Io> 0, 

inf y(F'' - F) < y(F'') , V/l e A , (2.9) 

where := 1 + ci (TA(/ltnax) - TA^min)) and j = 
spanCi"*.,/ = 1,...,/) uses the I explicit values 

'^'i = (^A(^min) + 'f^ (TA(^max) - TAC'^min)))- 

Prop.[T]tells us that using ( |2.2| l as a control variate 
makes sense because the variance in p.9| l decays 
very fast with / for all Ae h, and whatever A. 

But the explicit construction of A'., i - 1 . . . /, 
in Prop, [l] for random variables Z'' that are an- 
alytic in a ID parameter A, does not generalize 
to higher-dimensions in an efficient way, a well- 
known manifestation of the curse of dimensional- 
ity^ Yet, our numerical results (here and in |T|) 
show our variance reduction principle can work 
well in contexts with higher-dimensional or less 
regular parametrization. Then, given a parametric 
family Z^, A e A, how to choose parameter values 
A'.,i - 1 . . . /, that can in practice efficiently reduce 
the MC computations at all /i e A ? 

Given / e N, let us define A'., i - 1 . . . /, as 
the parameter values with optimal approximation 
properties in U^{L^p^^: the Unear space spanned by 
y^'i - Z'^'i -E(Z'^'i ) minimizes the variances over all 
/I € A, or equivalently the A', i = 1 . . . /, minimize 

inf sup inf y(Z^' - y-*) . (2.10) 

{A[,...,A',}eA' AeA ia})eR' 

The variates F"^', / = 1 . . ./, define an optimal I- 
dimensional reduced basis in L^q for the approxi- 
mations Y"* of the ideal controls F"^ at all A e A. 
The values dj,I e N, in p.lO[ ) coincide with a 



' A tensor-product of {2.9\ in dimension (/ yields a rate 
-I/d. 



well-known notion in approximation theory 1261 : 
the Kolmogorov widths of J/ :- {Z* - E(Z'^),A G 
A) in J/ c LpQ, and are upper bounds for the more 
usual Kolmogorov widths of J/ in L^q 

di inf sup inf \\Y^ - Y\y . 

{F/)e[L^„]' ^eA Fespan(F(,...,F,') 

Explicitly computing Kolmogorov widths is a 
difficult problem solved only for a few simple sets 
J/, let alone the computation of (non-necessarily 
unique) minimizers. But in a many-query para- 
metric framework where the MC computations are 
queried many times for many values of the param- 
eter, one can take advantage of a presumed smooth 
dependence of the MC computations on the param- 
eter to identify a posteriori some approximations 
for the optimal parameter values A', i - 1 . . . /, 
with prior computations at only a few parameter 
values. Moreover, in practice, we shall be con- 
tent with approximations of A', i - 1 . . . / that are 
not very good, provided they produce sufficiently 
fast-decaying upper-bounds of li/ as / — > oo. To 
our knowledge, this reduction paradigm has been 
applied for the first time in (l] to minimize the 
variance of many parametrized MC estimations. It 
is inspired by the (by now) standard RB method 
to minimize the discretization error in many-query 
Boundary Value Problems (B VP) for parametrized 
PDEs (see e.g. and Section[33|). 

Let us define linear combinations like ^2.2) 
when / = by 0. We next define a (non-unique) 
sequence of parameter values A, € A, / € N>o by 
the sucessive iterations / e N of a greedy algorithm 

Am e argsupy(Z'' - V Z^^) . (2. 11) 

For all / e N, the parameter values /I,, / - I . . .1, 
approximate A'., i = 1 . . . / in ( |2.10| i and yield the 
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following upper-bound 
/ 



CT/ := supy(Z'' - y afZ-^') > d, > d, . (2. 12) 

Using a finite discrete subset A c A (possibly A = 
A in case of finite cardinality card(A) < oo) and a 
MC estimator for the variance, like 



m — I 



m=l 
/ 



(2.13) 



we can finally define a computable sequence of pa- 
rameter values Ai e A, i e N>o by sucessive itera- 
tions / € N of a weak greedy algorithm at w e Q 



~Am e argsupyM(Z-^ - ^Z^O(w) . 



(2.14) 



For all / e N, our computable approximations 
of the optimal parameter values A', i - 1 . . . / 
used hereafter in practical applications will be A/ = 
Ai(co), i - I ...I, from a set like in ( |2.14| l. Note that 
the Ai, i e N>o, are constructed iteratively and thus 
do not depend on / but only on Ai (and possibly on 
A), which is useful in practice: the adequate value 
of / for a good variance reduction is typically not 
known in advanced and one simply stops the weak 
greedy algorithm at / e N when, at a given w e Q, 

/ 

a-, := supyM(Z-' - ^ a-Z'^O(w) < TOL (2.15) 

is smaller than a maximum tolerance TOL > 0. 

Even if greedy algorithms might yield approx- 
imate minimizers of p.lO| i that are far suboptimal, 
numerical results show that they can nevertheless 
be useful to computational reductions using the RB 
control-variate MC method, just as they already 
proved useful to computational reductions with the 



standard RB method in numerous practical exam- 
ples. Recent theoretical results Il27l |28l also sup- 
port that viewpoint as concerns the standard RB 
method and can be straightforwardly adapted to 
our framework. Comparing directly ct/ with di at 
same / e N will not, in general, give estimates bet- 
ter than ctj < ^dj. And this is pessimistic as re- 
gards the convergence of the greedy algorithm in- 
sofar as it predicts variance reduction only for sets 
with extremely fast decaying Kolmogorov widths 
dj. Yet, the two sequences crj and dj typically have 
comparable decay rates as / — > oo, and it holds for 
adequate rj,/], c,C > given any do, a,a > 0: 

a) d, < dol-" ,MI^ crj< Cd^r" , V/, 

b) di < doe-"''" ,MI^o-i< Cdoe-'^'^" , V/, 

c) d, < doe-"''" ,MI^ cr,< CIr]'e-"''" , V/, 

where c) is sharper than b) if, and only if, a > 1 
or a = 1 and a > ln2. So, when the Kolmogorov 
widths di decay fast (variance reduction is a priori 
possible), greedy algorithms can be useful if used 
with / sufficiently many iterations]^ 

The latter results can also be adapted to the 
weak greedy algorithm, like in Il28l . For all e G 
(0, 1) and / e N, given any 6j e (0, 1), we define 
the smallest number of realizations M^^'^^ such that 

P (l VmjKZ' - ?'') - V{Z^ - Y^)\ < Old,) > 1 - e 

holds for all A and control variates of dimension 
< i < I. Of course, we do not know exactly 
Mjg'j in practice, which could even not be finite if 
variations in A are not smooth enough. But if, for 
all e e (0, 1), we assume that one can use more 
than Mjg^j reaUzations in the computations of step 



^ Then computational reductions are possible, but only for 
sufficiently many queries in the parameter of course, as we shall 
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/ of the weak greedy algorithm, then the following 
events occur with probability more than 1 - e, for 
adequate r],/3,c,C > given any t/o, a,a > Q: 

a) di < dol-" , V/ 0-, < Cdol-" , V/, 

b) di < doe-"''" a-i< Cdoe-''^' , V/, 

c) di < doe-"''" , V/ ^ a-/ < Clrj'e-"''" , V/, 

see [Appendix A| for a proof. Of course, the con- 
stants C in the upper-bounds above can be overly 
pessimistic. But our numerical results show that 
one does not need to go to the asymptotics / — > oo 
in order to get good variance reduction results. 

2.2. Implementation and analysis 

At any step / € N of a weak greedy algorithm, 
whether we are looking for a new parameter value 
Ai+x or not {TOL reached), the practical imple- 
mentation of our RB control-variate MC method 
still requires a few discretization ingredients. In 
particular, one has to be able to actually compute 
approximations of E{Z^) = E{Z^ - a'^z}') + 
Yli^^ afE{Z'^') and ViZ-^ - P) whatever A e A. 
We are now going to suggest a generic practical 
approach to that question. On the contrary, practi- 
cally useful choices of A c A specifically depend 
on the parametric context and will be discussed 
only for specific applications in the next sections. 

First, we need to approximate the expectations 
E(Z^'), i = 1 . . . /. To obtain good useful ap- 
proximations of EiZ'^') in general, we suggest to 
use an expensive MC estimation £'m,,„„,(-Z'^')^ inde- 
pendent from subsequent realizations of the ran- 
dom variable Z*'. Although this is a priori as 
computationally expensive as approximating well 
E(Z^) at any Ahy a naive MC estimation, we do 
it only once at each of the few A/. Once a real- 
ization of EMi,„,SZ'^') has been computed at step / 
of the weak greedy algorithm (1 < i < I) with 



Marge s> 1 realizations, the deterministic result 
can be stored in memory. We denote the com- 
putable approximations of F'*' = Z"^' - E{Z^') by 

= - Em,„,JZ-^'), = 1 . . . /, so that the ac- 
tual control variate used in practice is in fact 
a linear combination in a linear space of dimen- 
sion / spanned by 1 . . . /. Note that given 
(Z'^O we expect realizations of Y^' to be com- 
puted concurrently with realizations of Z"^, using 
the same pseudo-random numbers generated by a 
computer, with approximately the same cost. 

Second, for any A, we need to replace the coef- 
ficients af,i- 1 . . . /, with a tractable solution af 
of the minimization problem p.5| l, so that practical 
control variates in fact read 
/ 

F-i = J^Q./?^' . (2.16) 

i=i 

Now, the MC computations at any one specific A 
should be fast, in particular the approximate so- 
lution to the minimization problem ( |2.5| l, whether 
we decide to stop the weak greedy algorithm at 
step / and use only / parameter values or still want 
to explore the parameter values in A to select a 
new parameter value Ai+i. So we invoke only 
a small number Msmaii of realizations of Z* and 
to compute the af. Note that there are differ- 
ent strategies for the numerical solution to a least- 
squares problem like p.5| l that use only Msmaii re- 
alizations of Z'^ and Y"^'. One can try to com- 
pute directly the solution to an approximation of 
the linear system ( |2.6[ ) where variances and covra- 
iances are approximated with MC estimations sim- 
ilar to ( |2.13| l using Msmaii i-i-d. realizations Z;,' and 
Y;1;, m = 1 . . . Msmaii, of Z-* and P'. The MC es- 
timations similar to p.l3| l are indeed interesting 
insofar as they remain positive, see ||29l l30l for a 
study of consistency. But there may still be dif- 
ficulties when C is too ill-conditioned. One can 
thus also apply a QR decomposition approach to a 
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small set of Msmaii realizations, using the Modified 
Gram-Schmidt (MGS) algorithm for instance]^ In 
any case, reahzations of Z'* and Y'^', i = 1 . . ./, 
should be correlated and thus computed with the 
same collection of random numbers in practice. 
Besides, one can use a single collection of ran- 
dom numbers (generated by initializing the ran- 
dom number generator at one fixed seed) for all A, 
including A/, i - 1 . . . /, and realizations of Y^' can 
thus be computed only once (at step / of the weak 
greedy algorithm) and next stored in memory. 
Finally we approximate E{Z'^) by one draw of 



1 



(2.17) 



whatever A and compute a MC estimation ( |2.13| l of 
y(Z'' - F"*) with the same Mtest realizations of Z' 
and Y'''. Remember that at step / e N of the weak 
greedy algorithm, ( |2.17[ ) is useful either to inspect 
all parameter values A e A, check whether TOL is 
reached and next select a new parameter value Aj+i 
when it is not. Or if TOL has been reached before, 
then ( |2.17| l is used for definitive estimations at any 
A (still with companion variance estimations ( |2.13| l 
to concurrently certify that TOL is maintained). 

Let us compare our strategy with the cost 
of direct MC estimations that have same confi- 
dence levels and thus use M realizations such that 
VMiZ'^)/M = VM,,JZ^-Y'')/Mi^,t. We denote by C 
the cost of one realization at one parameter value 
A compared with that of one multiplication. The 
evaluation of Msmaii + Mest realizations at each A, 



plus + /Msmaii multiplications for the QR solu- 
tion to the least-squares problem, /Mtest multipli- 
cations for the MC estimation of the output ex- 
pectation, and Mjgjjt for the variance, make our RB 
control-variate MC method interesting as soon as 
MC+M^ > (Mtest+Mstnaii)(C-H/)-H/2-HMt2^t' at least 
for "real-time" purposes. Then, the price of identi- 
fying / control variates with the weak greedy algo- 
rithm pays back if one needs to compute very fast 
EiZ'^) for any A, provided the / control variates still 
provide good variance reduction in the case where 
they were constructed by a weak greedy algorithm 
trained on ^ C A. In addition, the RB control- 
variate MC method is also interesting in the many- 
query cases where one has to compute E(Z'^) for a 
sufficiently large number jj/l of parameter values A. 
At each greedy step / = 1 . . . / - 1, in addition to 
variance estimations at each /I € A, the selection of 
Aj+i requires a quicksort of the sample of estimated 
variances {Vmks, (•2'* ~ ^'') > ^ ^ plus the com- 
putation of £'m,„j,„(Z'''+') and Ms^aii realizations of 
Z'*'*' to be stored. The cost of the greedy algorithm 

Cgi-eedy = Card(A)(/(Mtest + Ms,nall)C -H /(/ -H 1)(2/ -H 

l)/6+/(/+l)Ms,t,aii+M,est/2+/Mtit+lncard(A))+ 
/(Miarge+Msinaii)C is then Compensated by variance 
reductions as soon as '^A > Cgi-eedy/(MC + - 

(Mtest + Ms,™ll)(C + /) - /2 - Mlj. 



' The matrix with entries (?m )m=l...Ms„ai],;=l.../ uniquely de- 
composes as QR where Q is sl M X I matrix such that Q^Q is 
the I X I identity matrix and R is an upper triangular I X I ma- 
trix. The /-dimensional vector 5' with entries a'*, i = I ...I, 
useful in ? ' is next defined as R"' Q^Z (where Z is the Mj,^;,!!- 
dimensional vector with entries Zj'J,). The aj are thus random 
variables depending on Msmaii realizations and an approximate 
minimum of (23) is Z^Z - Z^QQ^Z. 



We also have a posteriori error estimates. For 
values af given by a fixed MC estimation with 
Msmaii realizations independent of Mtest new real- 



8 



izations in ^A7\ , CLT (|24| hold^ Va > 0, for 

( VV(Z-' - y'')/M,es, ~ , 

(2.18) 

where the MC empirical estimator £M,est(Z''~ ^'^) is 
defined in ( |2.17| i and where one can replace V(Z^ - 
Y'^) by a MC estimator Vm,,,S^'^ - by Slut- 
sky theorem. Furthermore, although ( |2.18| l is not 
a full error analysis because it does not take into 
account the bias ^^^i afE(f) = ^^^i Q:f(E(Z'^<) - 
EMy^^^iZ'^')), a function of M|^_^ and M^^^w precom- 
puted realizations, probabilities like ( |2.18| l are con- 
ditionally to £'m|.,„,.(-Z''') (and to a'^), so Bayes rule 
applies and it also holds, for all a, a, > 0: 



Ml 



large 



-(^)n-(i) 



Afiarge>M,es,-»oo 



where variances have been replaced by estimators. 

Last, to get a full convergence analysis that 
predicts an efficient variance reduction with the 
RB control-variate MC method in practice, at least 
when A = A, one should take into account all re- 
alizations in which in fact reads F;* = 

iVl^l^;|l|,iVl|;|i-j;j. 

Zli ff'M,„,„„ (Z'' - Em,„JZ''))- Yet, note first that 
in the weak greedy algorithm ( |2.14| i, only the 
MsmM realizations introduce new statistical error. 
And second, one can again predict that, with a 
good probability, the greedy algorithm is robust to 
discretization (in the same sense as in |28|) pro- 
vided the realizations of the least-squares problems 



* If the Miest realizations in j2.17) are the same as the A/s„,all 
ones used to compute aj, then j2.17^ is not a MC empirical es- 
timator of the type Em^^sAZ'^ ~ ^^)- I' '^"^^ independent 
realizations of the random variable Z' - 7'*, so the CLT does not 
hold and it is difficult to give a rigorous quantitative estimate of 
the statistical error. 



are not too close to rank-deficient and their numer- 
ical solution is close to the solution. We do not 
state this more rigorously but instead refer to ESll . 
whose results can be adapted in the same way as 
in [Appendix A| for the week greedy algorithm. 

3. Application to Uncertainty Propagation 

In this section, we numerically demonstrate the 
efficiency of the RB control-variate MC method 
for uncertainty propagation in a representative UQ 
framework. As example, we consider a PDE 
parametrized by stochastic coefficients and other 
non-stochastic coefficients which we term control 
parameters. The goal is to compute fast many ex- 
pectations of a scalar output of the random PDE 
solution for many values of the control parameters. 

For the numerical simulations, as usual in UQ 
frameworks [3], we use stochastic coefficients that 
are random fields with a Karhunen-Loeve (KL) 
spectral decomposition. In practice, one can only 
use representations with a finite-rank K of course 
and the MC method allows one to simulate the law 
of the random field just by generating realizations 
of the K random numbers in the KL decomposi- 
tion. We compute approximations to the realiza- 
tions of the random PDE solution with a standard 
FE discretization method (the same one whatever 
the realizations). The expectation of the random 
scalar output can be computed for many values of 
a control parameter just by reiterating many times 
the MC procedure. But this is costly. So first, we 
do not actually compute the PDE solution with the 
FE method at each realization of the stochastic co- 
efficients and for each control parameter value. In 
fact, we replace it with a cheap reliable surrogate 
built with the standard RB method Uke in il4l 
Yet, even with a cheap surrogate model instead of 
ffie full FE, the MC method is still costly, because 
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computing the expectation of the parametrized out- 
put under the input probability law still requires a 
very large number of realizations. This quickly be- 
comes untractable as soon as one has to do it many 
times for many values of the control parameter. 
Then, we use the RB control -vari ate MC method 
to decrease the number of MC realizations needed 
at most of the control parameter values. 

Compared with fT4T|, one uses here an im- 
proved MC method to show how various RB ideas 
can be combined to efficiently tackle some uncer- 
tainty propagation problems for partial differen- 
tial equations with stochastic coefficients. We thus 
especially discuss the practice of the RB control- 
variate MC method here. Yet, although the exam- 
ple is exactly the same as in [14], the specific use 
of the standard RB method in the frame of UQ, 
and in combination with the RB control-variate 
MC method, requires special care. That is why at 
the end of this section we briefly expose our im- 
plementation of the standard RB method, without 
which we cannot use the MC method at a reason- 
able price, and next discuss our specific use of it. 

3.1. An elliptic PDE with stochastic coefficients 

Consider first the solution m to a scalar Robin 
Boundary Value Problem (BVP) in a regular do- 
main D c M^: u satisfies Laplace equation in 2) 



■ div (a-Vm) = , 



(3.1) 



and conditions of third type on the boundary dD 

K{n ■ V)m + bu-g, (3.2) 

where div and V denote the usual divergence and 
gradient operators in D equipped with a cartesian 
frame, n the outward unit normal on dD, and k, b 
and g are scalar parameter functions. For the sim- 



ulations we will choose in particular 

K ^ k,\o, + hlo, (h,k2)eRla (3.3) 



8 = ^Ifr 



b e L°°{dD,R>o) (3.4) 



8 e L^dD) 



(3.5) 



where Ij). is the characteristic function of £),-, 



©I n ©2 = ©I U ©2 c £) c £)i U ©2 , 
and the boundary decomposes into subsets 



FBnrR^o FBurRC^© Fa, = 5£)\Fb u Fr . 

There exists a unique weak solution u e H^{!D) 
to p.l| - [3.2| l, which can also be defined as the 
unique solution to the following variational prob- 
lem ED: Find u e hHD) / Vv e H\D) 

I kiVu-Vv+ I A:2VmVv+ | buv ^ I gv . 

JDi JVn Jfb Jfr 

(3.6) 

As output, we consider the compliance]^ 

s = l(u) -I gu . (3.7) 

The accurate numerical discretization of p.6[ ) is 
standard, for instance using the Finite-Element 
(FE) method. In what follows, we shall invoke 
continuous piecewise linear approximations de- 
fined in conforming, regular FE spaces of //'(£)). 

To fix ideas, one could think of u as the tem- 
perature in a fin subject to a constant radiative flux 
g on Fr (in contact with a heat source) and to con- 
vective thermal exchanges on Fb, see Fig. [T| The 
function b determines the value of the Biot number 
along Fb, that is the intensity of local heat trans- 
fers on the part of the boundary in contact with 



' For symmetric problems like jXSj, the output is a 
particularly simple choice because it allows a simple accurate a 
posteriori error estimation without invoking a dual problem, but 
neither this choice nor the symmetric character of the problem 
are limitations to our approach, see e.g. L32J . 
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Figure 1 : Thermal fin geometry D used in tlie numerical sim- 
ulations (Fn = dD\(rji U Fb) denotes the boundary subject to 
homogeneous pure Neumann conditions). 



a fluid. Now, the Biot number is typically un- 
certain. We shall model it as a random field in a 
probability space (Q, ;F, P) and one is typically in- 
terested in quantifying the uncertainty propagated 
in s. More precisely, we define an essentially 
bounded function b € L^(Q, L°°(rB, K>o)), posi- 
tive almost everywhere (a.e.) on Fb with probabil- 
ity 1 (P-almost surely or a.s. in abbrev.). Then, 
with a slight abuse of notation, u now denotes 
the solution in U^(Q.,H\D)) to Kl 
b e L~(Q,L"(rB, 



£>())). And, for many values of 
the control parameter ki, we consider computing 
expectations E{s) - ^ sdP under P of the output 
s. To that aim, we will propagate the uncertainty 
(random "noise") from b to the scalar (random) 
output variable s e L'^{D) by simulating the law of 
u. We recall that here we focus on MC discretiza- 
tions of the noise. Compared with other discretiza- 
tions 13311341 . it is very easy to implement, in par- 
ticular because it is not intrusive with respect to ex- 
isting discretizations of the deterministic problem. 



although the convergence is a priori much slower 
when the stochastic variations are smooth lfT2l[T3]| . 

We consider a bounded second-order stochastic 
process b as random input. We denote its constant 
mean value on Fb by E (/jIfb) = E > Q, the spatial 
correlation length between random fluctuations by 
6 > Q and the relative intensity of the (bounded) 
random fluctuations around the mean by T > 0. 
More precisely, like in lfT4l . the random field b 
shall be the limit in L"(Q, L°°(rB)) of random 
fields bj^ whose covariance defines a kernel oper- 
ator of finite rank K in L^(Fb) and which can be 
easily simulated with a MC method using K inde- 
pendent random scalar parameters in a priori given 
ranges. For the numerical simulations, we choose 
a gaussian kernel c(x,y) - exp^-|x -yp/^^) /|Fb| 
to define a covariance operator in L^(Fb), where 
x,y are in the same connected component of the 
boundary Fb in Fig.[T] The covariance operator has 
eigencouples (/}t,<l)j.), k e N>o and a normalized 
trace Yi'k=\ '^k - ^- Then, invoking uniformly dis- 
tributed independent variates ~ Hi- V3, V3) 
for each spectral mode k e N>o (the same on each 
size of the boundary Fb in Fig. [TJfor the sake of 
simplicity), we define b by its KL representation 



1 + 



iteN>o 



(3.8) 



3.2 1 when which makes sense since truncated representations 



bK^E 



K \ 

k=\ 



(3.9) 



converge in L],{'D.,L^{YYi)), and in L^(Q,L"(Fb)) 
here (see 1351 l36l e.g.). We also require bK > 
b,„i„ £/2 > a.s. V/T e N and fix Y := .1 
for 5 € (.05, .5). Then one can define well-posed 
EVPs: Find uk e U^{Q.,HHD)) such that f-a.s. 

K{n ■ V)uk + bx Uk = g on dD , 

-di\{KVuK) = QonD. (3.10) 



11 



Approximations uk converge to u in L^(Q, H^{D)) 
as — > oo and for a.e. w e Q, realizations ugico) e 



H {D) of the solution to (3.10i solve a variational 



formulation with test function v e //' (£)) 

(w) V 



+ E \ e UK(a>) V - I 
Jfb Jr, 



(3.11) 



parametrized by k\,k2,E and reaUzations 
T y/A^Ziiioj), k = 1 . . . /T. So finally, in prac- 
tice, we shall compute approximations u/^^k to 
u with the FE method defined previously, which 
converge in Lp(D.,H^(D)) as N ^ 00, K ^ 00. 
We require N and K large enough such that, for 
a.e. values of the control parameter ki,k2,E, a 
tolerance level \s - .syv./fl ^ TOL\s\ is reached a.s. 
for the approximate output s/^ k ■- L Myv/f- Next 
one can easily simulate the law of s/^ k with the 
MC method by mapping realizations Z^((jj) of Z^., 
k = 1 ... ^T, to realizations s/^ k{<^) of iyv.A:- 

But a direct MC-FE method is very expensive 
computationally, since one has to compute many 
FE approximations w^v./rCw), for many realizations 
of Z^(cLt), and next for many values of /I = (k2, E). 
That is why, like in lfl4l . we replace the FE approx- 
imations u/^ K by cheaper surrogates Uf^ K,N con- 
structed with the standard RB method (wee will 
come back to this point in Section [33] and the non- 
expert reader can find a brief exposition of this 
technique there). At present, the RB method is one 
of the only few existing alternatives that can tackle 
some "high-dimensional" problems in more than 
2 or 3 dimensions, and it proved usefuj^when the 



* Note that it does not a priori require specific discretizations 
of the parameter space and speed-up can be increased for goal- 
oriented purposes like computing i accurately instead of u. 



parameter is not too high-dimensional, for instance 
when each of the parameter components ky,k2,b 
and g are scalars. But when the parameter becomes 
high-dimensional, difficulties arise again at some 
point. To some extent, the RB method still showed 
efficient for MC simulations of a random field b 
with a moderately large number K of modes IIT4ll . 
But there are situations where the approach is still 
computationally too expensive, in particular when 
one wants to explore variations of the MC simula- 
tion with respect to the control parameter k2 and E 
for instance. That is why, in this work, we would 
next like to show how to further improve the com- 
putational cost in cases where one is interested in 
computing many values of the expectation E{s) of 
s under the law of the random field b for many val- 
ues of the control parameters k2 and E. To this aim, 
we build on the standard RB ideas and use a MC 
method with a reduced basis of control variates. 
In particular, at a given value of A, one still needs a 
large number of RB approximations u/^^k,n to com- 
pute accurate MC estimations of E {s/^ k,n), which 
is too costly when it has to be done for many values 
of A. We thus now try to reduce the computational 
cost by invoking still another computational reduc- 
tion technique, the RB control-variate MC method. 
We will show that the number of realizations re- 
quired by the MC method to reach a given statisti- 
cal accuracy for E (sat.a'.a') ^t one parameter value 
A can be efficiently decreased with our RB control- 
variate technique, in the limit where one has to 
compute sufficiently many expectations for suffi- 
ciently many values of A. 

3.2. RB control-variate MC estimations 

We require a relative precision TOL - 10"^ 
for the numerical approximations E(s/^ k,n){A) of 
the output expectations E{s){A) at any A e A := 
(.1, 10) X (.1, 1). It is thus sufficient for the PDF 
solution M to be a.s. approximated with a relative 
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mean 




Figure 2: Maps of E [s(k2, £)) and V [s{k2, E)) (numerical ap- 
proximations) for {k2, E) e (.1, 10) x (.1, 1). 

precision tol - 10"^ here. To this aim, we first 
define Pi FE approximations on a 2D simplicial 
mesh using FreeFem++ by F. Hecht and his col- 
laborators. (Refining a coarse mesh, N - 3333 
nodes - 6272 triangles - seem sufficient.) 

We first fix (5 = 0.5. By a standard "variational 
crime" analysis (see e.g. 1 14|), it is enough to use a 
truncated KL expansion Bk^l^^oK - 10 (such that 
Y.k>K ^ tol Y^ken^o V^)- We build "offline" 
(that is, before the MC simulation) a reduced basis 
for a small linear spac^of surrogates u/^ k,n with 
dimension = 12. Then we simulate the law of 

= iyv,A:,A'('^) at any A e A with the MC method. 
This allows us to retrieve a MC estimation of E{Z^) 
and to construct a response surface by piecewise 
linear interpolation of the MC estimations Em{Z'') 
at the 10 X 10 trial values of A like in Fig.|2j 

' A weak greedy algorithm needed N = 12 steps for the a 
posteriori error estimates to satisfy ||»w,a' - "a/.at.wIIhi ^ ^ 
tol = 10"^ at all training parameter values in a cartesian grid 
using 2 trial values for each parameter component t ^fX^Zk, 
k = I . . .K, and 10 trial values for each parameter component 
A\ = ki, Ai = E, see Section [33] for some details about the RB 
method. 



To compute the response surface of Fig.|2]with 
a good confidence level in the MC estimations, say 
with ylVM{Z^)IM < TOL x £'m(Z'') at each of 
the A used for interpolation, one needs more than 
M - 10"* realization^ at each A. Now, the law of 
Z'' with respect to A is smooth. Let us use the RB 
control-variate MC method with A defined by the 
10 X 10 trial values of A needed to build the re- 
sponse surfaces in Fig. [2] Then, we can achieve 
MC estimation with confidence intervals of ac- 
curacy TOL using a large number of realizations 
Marge - lO** at Only 7 = 3 values of A, and much 
less reaUzations at the other values of A to achieve 
satisfying confidence intervals. More precisely, 
Attest = 10 is enough, since one can reduce the 
variance to 10 with 1-3, and one already has a 
good MC estimation of the magnitude of the vari- 
ance then. One also needs to compute coefficients 
af at each A with Msmaii additional realizations, but 
in practice one can use the same Mtest = small re- 
alization^ So the marginal gain for each A in the 
finite sample of queries A is to divide the computa- 
tional cost by 10'*/(10+/xl04/Carc/(A)) > 30 (the 
cost C of one realization is dominant here, even 
with RB surrogates). 

Moreover, our algorithm is robust with re- 
spect to the specific choice of A above. That 
is, the control variates identified previously can 
still be used to reduce the variance of Ef^{Z'^) 
at other i in A than those in the trial sample 



* Besides, one can check that M = W* is sufficiently- 
many realizations here for the speed of convergence to be quite 
well evaluated by CLT, since the MC estimator Eu(Z'^) is al- 
ready close to gaussian according to the Kolmogorov-Smirnov 
goodness-of-fit test. In particular, most often, one single draw 
of the MC estimator already allows one to compute a good ap- 
proximation to a confidence interval. 

' Although introducing some dependency between the Mtest 
and Msmaii realizations theoretically influences the possibilty to 
obtain a CLT for instance, it hardly changes siginificantly the 
numerical result for one MC estimation in practice. 
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A used by the greedy algorithm above. Let us 
evaluate the gain for real-time purposes or in the 
limit of many queries in A when one can for- 
get all the precomputations including the greedy 
construction. Then the computational cost per A 
is approximately divided by 10"*/10 - 10^, pro- 
vided the reduced variance is still approximately 
of magnitude 10 for all A e A. This observa- 
tion even holds if we require a higher precision 
(hence smaller tolerances TOL, tol and correpons- 
dingly increase "offline" the dimension of the 
RB space for the PDE surrogates). Then, the gain 
grows like TOL^^ in the infinitely-many-query 
asymptotics as long as one can reduce the variance 
by adding finitely-many control variates; and like 
TOL-^{\ - / X TOL-^ICard(k)) if one takes into 
account the greedy construction for a finite sample 
of Card{A) » / x TOL^^ parameter values, where 
/ is the minimal number of control variates re- 
quired to achieve a reduced variance of magnitude 
TOL in A. (Note that the computational cost of 
one realization becomes even more dominant as 
increases.) Here, Fig.|3]shows that that our greedy 
construction is robust: with control variates built 
using the 10 x 10 trial values of A above, we did 
thorough MC estimations of the variance for 100 
other parameter values. In our example, it roughly 
holds / < -log(rOL) for TOL > 10"^ and the 
computational gain grows exponentially fast like 
TOL-\l + TOL-^\og{TOL)ICard(K)) with the 
number - \og{TOL) of accurate digits required for 
sufficiently large Card{A). 

In any case, if one cannot reduce the vari- 
ance (or equivalently, if / needs to be huge for 
any useful variance reduction), there is - almost 
- no loss of effort in using the RB control-variate 
MC method compared with the naive MC method, 
since would simply observe the absence of vari- 
ance reduction using Mtest reahzations and would 



next need to increase the number of realizations to 
the same number of realizations as the naive MC 
method. Only the computation of coefficients o-^ 
would have been unnecessary, which is not much 
compared with a large number of realizations. 

Of course, the efficiency of our RB control- 
variate MC method is limited by the cost and ac- 
curacy of single realizations, like any MC method. 
This has nothing to do with its ability at reducing 
the variance (and thus controlling the number of 
realizations needed for accurate MC estimations), 
but it is a practical limitation everyone doing sim- 
ulations is confronted with. For UQ in particular, 
if we use a smaller correlation length 5 - 0.05 but 
require the same tolerance TOL, the "variational 
crime" analysis above requires a KL truncation at 
a much higher-order K - 70. Now, it is practi- 
cally impossible today (in 2012) to carefully in- 
spect all 2 -H = 72 directions with a fine grid of 
trial values for the "offline" construction of RB sur- 
rogate^^ So, even though the RB control-variate 
MC method still works perfectly well with inac- 
curate RB surrogates u/^ k,n, there is little com- 
putational gain possible on the whole simply by 
reducing the variance when the statistical error is 
dominated by the KL truncation error (there is no 
point in getting accurate MC confidence intervals 
if the RB surrogates are not good enough). The 
RB control-variate MC method is not a definitive 
cure to the "high-dimensions" problem for PDEs 
with stochastic coefficients which we already men- 
tionned previously. 

All existing numerical approaches to UQ prob- 
lems that invoke a KL decomposition of the input 



The computations presented in this work have been done 
in a very reasonable time, from a few minutes to one hour on a 
single processor unit, but inspecting 72 directions with 2 points 
per directions, that is 2^2 « 10^' points, would require yers of 
computations. 
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Figure 3: Top: maximum, mean and minimum in a sample of 
MC estimations Vm.^j, (Z'' - afY') at various A, as a func- 
tion of /. Tlie decrease rate is less fast among parameter values 
A not used by the greedy algorithm, but it is still reasonably 
good. Middle: confidence intervals with probability 99% and 
95% for the MC estimation (Z'' " Q-;* of EiZ'^) as a 
function of / at one A. Bottom: confidence intervals with prob- 
ability 99% and 95% for the MC estimation Em(Z-^) of £(2-*) 
as a function of M at same A, with a much slower decrease rate 
than above at a similar computational cost. 



noise are limited in practice by the "curse of di- 
mensionality" in K anyway, in so far as they re- 
quire numerical approximations of realizations of 
the PDE solution in a space of high dimension K. 
(Most often in UQ people only use K < 10 in 
practice 135] |7l [37l.) So is our approach. But 
in any case, when one can tackle the problem of 
simulating high-dimensional reaUzations, say only 
by expensive precomputations, then using our RB 
control-variate MC method and the PDE solution 
as a black-box still makes sense: the simulation 
of one reaUzation of Z ' at one given A is typically 
all the more expensive as the parameter size in- 
creases, for instance just by assembling the matrix 
of the discrete BVP here, and the gain obtained by 
variance reduction (if any) is in fact all the more 
interesting. Moreover, there exist other ways of 
simulating the law of b that do not require a KL 
representation (thus a limitation in K), even extend 
to other random fields than p.8| l, and can be used 
with our RB control-variate MC method ll38l . This 
might be a cure to some problems but we keep its 
investigation for future works of ours. Last, for 
the model problem here, we are in fact quite fortu- 
nate, as already noted in y_4J, that the KL spectrum 
has a fast decay. The first KL modes span all the 
solution space when the required precision is not 
too small. Then we can still build "offline" good 
enough surrogates uj^^ ^.n with a very small train- 
ing set of trial parameter values, as we are going to 
see in the next section. Furthermore, since the pre- 
vious observation holds only for specific choices 
of the spatial correlations, before shifting to the 
numerical proof of efficiency for the RB control- 
variate MC method in another UQ framework, we 
are also going to mention one promising way of 
tackling the simulation problem with high-order 
KL truncations which is currently under study, and 
where the RB control-variate MC method can still 
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be useful. Our conclusion of the first example 
is thus mostly concerned with an efficient com- 
bination of the standard RB method with the RB 
control-variate MC method in a UQ framework. 

3.3. RB control-variate MC with RB surrogates 

We briefly recall the "standard" RB method 
used above (see also ll39l l40l 1411 e.g.) before dis- 
cussing its specific use in UQ with a MC method 
like our RB control-variate MC method. The 
reader already expert in the standard RB method 
may want to skip the next subsection. 

3.3.1. The standard RB method 

The goal of the RB method is to reduce the 
computational cost of solutions u{jj.) to a PDE 
parametrized by fi when m(ju) has to be computed 
many times for many values of fi. Here we set /i = 
{k2,E, Y ^/AlZl{^o), . . . , Y ^fAKZK(cJ)) and given a 
Hilbert space X - H^{D) with inner-product 

(m, v)x = I Vm ■ Vv -H I Mv , Vm, V e X 

Jv Jfb 

and with energy norm ||m||x - ^l{u, u)x, we denote 

a(u,v; jj.) — ki I Vu-Vv+k2 I Vm-Vvh- I buv 

a bilinear form that is elliptic Vm, v e X. Then, for 
parameter values in a given range, we consider 
the solutions u{fj) € X to p.6[ ), such that 

a(MOu),v;A/) = /(v), VveX, (3.12) 

with outputs s{fi) - l(u(fj.)). In practice, we as- 
sume that one can compute a good approximation 
of u(fi) in a linear subspace Xj^ c X with large 
dimension TV » 1 for any fi. Moreover, here 
we discretize b x bx so the fully computable ap- 
proximations also involve a discretization param- 
eter K and is denoted u/^^k(m) g X/^. Comput- 
ing U/^^Kifj) for many /i is costly and the goal of 



the RB method is to construct a linear subspace 
Xfj c Xyv with smafl dimension N N such 
that, for all fi in the given range, some approxima- 
tions uu.k.n(I-i) £ Xn to M(yu) e X can be computed 
faster The RB method invokes linear approxima- 
tions Uf.jx.N(t^) TI^=i y„{fj.)u(fi„) where the coef- 
ficients jniji), n - 1 ... A^, are computed fast at any 
yu € A by the Galerkin method in Xf^: m - I . . .N 

N 

^ yn(^i)a(u(li„), u(p„,y,jj) = l(u(^,„)). (3.13) 

n=l 

The Galerkin approximation error ||Myv,A:(/^) - 
uu.k,n(m)\\x can be estimated a posteriori by a fully 
computable error estimator Af^i/j.) 

,, ,, ^ a(uu,K,N(j^),v;fi)- l(v) 

\\UN,K - Un^k,n\\x < sup — 

l|v.|b=i o:lb(m) 

(3.14) 

using a lower bound aisilJ) for the coercivity con- 
stant of the bilinear form, which is useful for the 
certifiability of the reduction method as well as 
for the selection of adequate parameter values //„, 
n - 1 ... A^, in the range of interest. Here, given 
the choice of inner-product on X - //'(£)), one 
can compute explicitly and at Uttle expense quite a 
sharp lower bound aisitJ) - min(fei, A;2, inf b) that 
is uniformly good with respect to the parameter 
values and for any admissible parameter range. 

A good reduced basis u(jji), . . . ,u(fij^) span- 
ning Xn can be found in practice with a greedy 
algorithm like ( |2.1 1[ ) |27 , 28 1. The standard proce- 
dure selects parameter values fxi,. . . ,fiN incre- 
mentally in a trial sample for /i as follows: given 
any pi we next define for n = 1, . . . , - 1 

p„+i e argsup A„ip) . (3.15) 

Note that the bilinear form a is affine with 
respect to functions of the parameters (viz. 
parametrized by p only through coefficients in a 
linear combination of bilinear forms, recall ( |3.1 l| l). 
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This is useful to efficient RB implementations in 
practice since the parameter-independent matrices 
can be precomputed. It allows fast evaluation^of 
un,k.n{p) and Amiij) at any /i. Here, the quality of 
our RB approximations is similar to that in [14J. 

3.3.2. Certified RB contml-variate MC 

First, note that to get certified results in the 
frame of MC simulations, one should check the 
quality of those RB approximations when they are 
used, since in any case the precomputed basis for 
the RB surrogates invoked at each A and for each 
realization has been trained "offline" only with a 
few trial parameter values in (possibly forgotten 
afterwards). In the RB control-variate MC method, 
it is necessary to check the RB approximation er- 
ror with p.l4| l at least for the Mia,ge reaUzations 
in £m,,__^^(Z''0; ' = 1 . . ./, at the parameter values 
Ai where control variates are constructed plus, for 
each A, at the (1 -i-/)Mtest realizations of Z'' and Z '', 
/ = 1 . . . /, that are used to compute the final MC 
estimation with reduced variance. In practice here, 
this increase of computational cost per realization 
and A was not much (that is, only a fraction of the 
total computational time without certification). 

Moreover, for 6 - 0.5 and K - 10, we precom- 
puted a reduced basis of dimension N - 12 for 
the PDE solutions using only a cartesian grid of 



Denoting by M the matrix of the inner-product (■,-)x 
computed for the discrete FE space where the parametrized 
BVP reads A(ii)U(p) = B, the RB approximation enw 
in the "energy" norm ^{U(ii) - U „(li)Y M{U (fi) - U„(p)) 
is evaluated by the residual estimator 
y/{A(p)U„(p) - B)TM-HA(p)U„(ji) - B)/atB(p)- Now, 
(A(ti)U„(ji) - Bf M-\A(ti)U„(p) - B) is a quadratic form in 
the small-dimensional vector U„(p) containing the coefficients 
of the approximation in the reduced basis. And its entries are 
themselves quadratic in (functions of) /j when the entries of 
A(ji) depend linearly in (functions of) //. The latter can thus be 
precomputed for a given reduced basis and next be assembled 
in 0(N-K-) to yield fast error estimations at any /j. 




12 3 4 5 6 

CV basis size 



Figure 4: Maximum, mean and minimum in a sample of MC 
estimations Vmksi (^'' ~ 2/= i '^f various /I, as a function of 
/. The decrease rate is less fast among pai'ameter values A not 
used by the greedy algorithm, and less fast than in Fig.|3] 

10x10x2'*- values for fi. Although this is a coarse 
training sample, we never had to enrich the RB ap- 
proximation space Xfj "online", that is during the 
MC simulations. Thus, for UQ problems similar to 
our example, although our MC approach does not 
aim at exploiting optimally the regularity of the so- 
lution, in particular because the choice of trial pa- 
rameter values for a greedy algorithm is "blind", 
the RB control-variate MC method is clearly inter- 
esting: it is simple to implement, without much a 
priori knowledge of the solution, and accurate at a 
reasonable computational cost ("offline" and "on- 
line" computations take a few minutes on a single 
processor unit here). 

For 6 - 0.05, we mentionned previously that 
the problem is not to reduce the variance of MC 
estimations, but to get certified MC estimations 
when one uses a KL representation of the input 
noise at each of the MC realizations, because the 
parametrization of the latter needs approximately 
K = 10 dimensions, which cannot be explored 
finely "offline" in a reasonable time (one mani- 
festation of the well-known "curse of dimension- 
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ality")- One way out is to simulate differently the 
noise, in a way that could invoke RB surrogates 
in a low-dimensional parameter space only ll38l . 
We shall study this approach in future works. An- 
other way if we want to use ICL representations 
is, first, to observe that since the error per realiza- 
tion is averaged in E{s;^ xni(A)) one can still af- 
ford some realizations with large error as shown 
in |n4|. Moreover, since the solution space very 
much depends on the first KL modes only as ob- 
served in llJl, one can build here "offline" surro- 
gates u/st^K.N in a space of low dimension. Here, 
we could use RB surrogates from a small linear 
space Xn of dimension N = 20 without having 
to enrich "online", although X^ was built "of- 
fline" with a greedy algorithm exploring only a 
very coarse training sample of trial parameter val- 
ues (using 10 trial values for each parameter com- 
ponent Ai = k2,A2 - E and 2 trial values for 
each parameter component Y yfAkZi,, k - 1 ... 10, 
while for ^ = 1 1 ... 70 the parameter components 
were fixed). Then, compared with a direct "naive" 
MC simulation of the law of = s/^ k,n{^) 
any A e A, the RB control-variate MC method 
provides computational reduction by reducing the 
variance of MC estimations like in the case where 
6 = 0.5, see Fig.|4]in comparison with Fig. [3] Note 
yet that the variance V(Z^) is approximately one 
order of magnitude smaller when 6 = 0.05 than in 
the previous example when 6 - 0.5, and the vari- 
ance decreases a bit more slowly than before, so 
the gain at same accuracy is a bit less than one or- 
der of magnitude smaller (and non zero only if we 
require a minimum precision on the confidence in- 
terval of the MC estimations of a bit more than one 
order of magnitude higher than when 6 = 0.05). 

Last, we would like to mention a way to reach 
higher KL truncation order K which is not spe- 
cific to our choice of the correlation length and 



where the RB control-variate MC method can still 
be useful. Note first that, quite often, surro- 
gates uu.k,n{^) built for a coarse training sam- 
ple (say with only 1 trial parameter value in most 
directions) are already not too bad because the 
KL spectrum decays anyway. So, at any A, one 
can expect Vis;^x,N'i^) - -^n,k,n(^)) to be signif- 
icantly smallel*^ than V(s/^^k,n(^)) and next im- 
prove the MC estimations of E(s/^^kn{A)) "on- 
line" at a very reasonable cost, by the combina- 
tion of the RB control-variate MC method and the 
Multi-Level Monte-Carlo (MLMC) method ED in 
two steps: estimate (i) E{s/^ kn(A)) with the RB 
control-variate MC method and (ii) E{s/^^K.N'i^) - 
su.kM'^)) with a small number M,'^^, of realiza- 
tions. The dimension A^' > N of the enriched 
basis at A can be chosen a posteriori in (ii) from 
the few M,es, realizations in the MC estimation 
EMi„,{'^N,K,Ni'l) - Y"*) of (i). In turn, the actual / in 
(i) can be re-adjusted after (ii) to balance the statis- 
tical error in Em[^^^(su,k.n'(^) - ^^u.kM'^)) such that 
Vm,,A^n,kM^) - ?'^)/Mtest ~ VmiJsu.k,n'(^) - 
-^u,k.nW) latest if possible. Of course, the latter 
new strategy deserves a specific study, which we 
also keep for future works. 

4. Application to Bayesian estimation 

Let us now numerically demonstrate the effi- 
ciency of the RB control-variate MC method in 
another context useful for UQ, with parameteri- 
zations of higher dimension than in the previous 
section. As example, we consider the computa- 
tion of a Minimum-Mean-Square-Error (MMSE) 



In our numerical example above, Vmj. j(*yV,A:,A"('l) - 
^N,k,nW) is in fact always close to zero machine by the prop- 
erty mentionned above: at a moderately high precision level, 
the solution space is almost entirely spanned by variations in 
the first K modes only. But of course this would not be true for 
random fields with thicker distribution tails. 
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Bayes estimator in various parametric settings. To 
identify various possible cases with many queries 
in the parameter, we first use a toy-model. Second, 



we use the same model PDE problem (3.1-3.2i 
as in the previous section. In the latter case, we 
also combine the RB control-variate MC method 
with standard RB surrogates of the PDE solutions, 
like in the previous section. But of course, the use 
of surrogate models in Bayesian estimation is not 
new 1.38.. 20J . 

4.1. A toy model for various Bayesian estimations 

The Bayesian estimation of some unknown in- 
put parameter in a model consists in improv- 
ing a prior guess of its probability distribution tt^ 
(the probabilistic counterpart of the determinis- 
tic Tikhonov regularization in inverse problems, 
see ||9l e.g.) by observations of an output s'^{0) 
of the model that are viewed as realizations of 
a random variable ||43l . We denote by /I a con- 
trol parameter entering the model, and by ^ a 
so-called hyper-parameter in the Bayesian frame. 
The random variations of s^{9) can be generated 
by aleatoric noise (for a given fixed value of 9, 
the truth is stochastic and induces a distribution 
of value s'^iO)) as well as the epistemic uncertainty 
about the parameter (our observations of the truth 
are noised). 

When one knows the value of a control param- 
eter /I in a model, as well as J observations sj, 
j = 1 ... 7 (7 i.i.d. realizations of s^(6)) with likeli- 
hood f^'H^jlS), the probability density of s^ know- 
ing 6), then Bayes formula allows one to compute 
the posterior distribution of 

J 

n^'^'H0\{s'j}) <x 7r^(0) Y] f'Hs']\0) (4. 1) 

j=i 

where ^ is another hyperparameter (entering the 
likelihood and not the prior Uke ^). The poste- 



rior is used to compute the probability distribu- 
tion of quantities that depend on 0. It is also 
used to compute a deterministic approximation 
gU.^(^{sp of given {ij,; = 1.../), for in- 
stance the Minimum-Mean-Square-Error (MMSE) 
estimator p4l p. 349], which in decision theorjj^ 
is interpreted as the minimum of the expected 
quadratic loss E(\0^'^'^{{sj}) - 0p), averaged over 
both distributions of and (ij, y = 1 . . . 7) 

0''('(({sj}):^ j07T''^''^{0\{sj})d0. (4.2) 



In practice, expectations using ( |4.1| l like ( |4.2| l 
are typically computed many times: 

• for many values of the control parameter A en- 
tering the model for the "truth", 

• using many different sets of observations 

y = 1 ... 7), typically at Aq - A but not 
necessarily, and 

• for many values of the hyperparameters ^, ^ 
entering the Bayesian frame (prior and likeli- 
hood are often chosen in - conjugate - para- 
metric families of distributions). 

Think of the various pieces produced in a factory, 
or of real-time estimation procedures, where one 
may want to vary all these parameters ! We would 
thus like here to accelerate the computation of 



Other risk functions than the quadratic loss and other de- 
cision rules than the risk minimization are also used in prac- 
tice 1441 1431 . which lead to different estimators for 8. For in- 
stance, one also uses the expected linear loss function E{\S - 0\) 
which is minimal at the median 5, defined by P{9 < 6\{s'^\) = I , 
and the maximum likelihood principle that takes the maximum 
a posteriori maxo P(9\{sj]) as estimator. In this work, we con- 
sider only the MMSE estimator j4.2| . Notice that it is a good 
example for applications where one is interested by the expec- 
tation of a smooth functional of 6 in the end, because its com- 
putational complexity with a MC method if of the same kind. 
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many parametrized MMSEs in many-query frame- 
works with our RB control-variate MC method and 
numerically explore the various parametrized set- 
tings with a toy-model first. 

Let us choose a Bayesian frame for suppos- 
edly Gaussian observations of with known mean- 
square deviation A, typically 



Gaussian prior and likelihood are conjugate one- 
another and the MMSE is analytically computable 



({sj}) 



0-2 + A^/J 



1 



A^/J 



(T^ + A^lf 



with variance 



(see ill p.353] e.g.). Note 
that for this simple model, there is no degree of 
freedom for a hyperparameter ^, but one already 
sees why one may want to use various values of the 
hyperparameter ^. Although the MMSE Bayes es- 
timation is asymptotically consistent and efficient 
(in fact, normally distributed) as 7 — > oo, the qual- 
ity of the estimation is clearly impacted by the 
choice of the hyperparameter at finite J. See the 
posteriors computed with J - 1, 10, 100 observa- 
tions and 0Q - \,A-.5,ii-.9,a--Am Fig.|5] 

More generally, one can show under regular- 
ity assumptions on the likelihood f'^'^{-\ff) (see ||451 
p.490] e.g.) that, if the observations are indeed 
realizations of a random variable s'^{0) with den- 
sity f'^'^(-\0) for one fixed value - 0q (hence 
distributed only due to the aleatoric noise of the 
model), then the MMSE converges in distribution 
when the number J of observations increases 

V7(0^'«-0o)^MO,/(0o)"'), 

with variance given by the Fisher information 
I(0o) ^E(^{dglogf({sj}\0)f \0 = 0o) . The MMSE 
estimator is thus asymptotically consistent and 
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Figure 5: Posterior distributions computed for J = 1, 10, 100 
i.i.d observations sj ~ N(8(, = \,A^ = .25) (top) ; and varia- 
tions of the MC MMSE Bayes estimator (middle) and its vari- 
ance (bottom) as a function of A for lO' realizations. 
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asymptotically efficient and the choice of the prior 
becomes irrelevant as 7 — > oo. Yet, for finite sam- 
ple of observations, a good MMSE Bayes estima- 
tion strongly depends on the choice of the prior, in 
particular because MMSE Bayes estimators are bi- 
ased. So one may try to optimize the hyperparam- 
eters given training samples of observations, pos- 
sibly for various values of the control parameter: a 
natural many-query problem for the MC computa- 
tion of parametrized MMSEs ! Let us see how the 
RB control-variate MC method performs in vari- 
ous parametrized settings. 

We do many MC estimation of the MMSE with 
7=10 i.i.d. synthetic observations generated with 
distribution law N{9o,Aq) and Oq = = .9,cr = 
A. We tested the RB control-variate MC method 
for various meaningful parametric variations: 

• /I e (.1, .9) with one set of observations gen- 
erated at Ao - .5 fixed or at Aq = A, 

• A e (.1, .9) and many sets of observations gen- 
erated at Ao - .5 fixed or at Aq = A, 

• ^ = i^i,o-) e (.5,1.5) X (.1,. 9), ^ e (.1,.9) 
and many sets = 1 ... 7) of observa- 
tions generated at Aq - .5 fixed or at Aq = A. 

We always obtained meaningful estimations, and 
good variance reductions (more than 10 orders of 
magnitude) with only a few variates (less than 
10). We show in Fig. [6] the "most difficult" case, 
which is still of course quite easy because of the 
very smooth dependence of our toy-model on the 
parameter. But this shows at least that even in 
potentially high-dimensional parameter contexts, 
there is some hope for the RB control-variate MC 
method to be useful. 

4.2. A PDE model with uncertain coefficients 

We now consider the PDE problem p.lf|3.2[ ) as 
a parametrized numerical model for Bayesian esti- 
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Figure 6: Variations of MC MMSE Bayes estimation (top) and 
tlieir non-reduced empirical variance (middle) as functions of 
(p, o"), using one set of observations and one A. Bottom: em- 
pirical MC estimations of the reduced variances as functions 
of the number / of variates for 20' values of the parameter 
y = 1 ■ ■ ■./), 'I, f = (/J, o"))- Each colored line is the log- 
variance for one parameter value. 
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mation. It is discretized like in the previous sec- 
tion: given 6, one fixes the discretization parame- 
ters N, K and A^. Note that we invoke the standard 
RB method p^/SJ] to compute fast numerous so- 
lutions M to p.lf|3.2[ ) at each of the numerous val- 
ues of the PDE coefficients invoked, which has 
already been used in a Bayesian estimation con- 
text with a deterministic quadrature formula for the 
output expectation p20l rather than MC. 

We still denote A-{k2, E) the parameter in the 
model. In addition to s'^ defined in p.7| i, we also 
consider the temperature averaged at the top of the 
fin o'* = u as output. Clearly, there are many 

situations where the uncertain coefficient b can a 
priori be modelled using only some rough repre- 
sentation as a random field p.8| l. Then, it often 
needs improving in a real setting using data from 
observations. This is the case of the permeability 
field for the Darcy equations in f38l e.g. 

Here, we shall try to improve i.i.d. Gaussian 
prior^Zi ~ TT^ = NiO,^k\ = cr^, for the 
random variables Z^, k - I . . .K, using J i.i.d. 
observations o^°, j - 1 ... 7, of the ouput o'*° at 
some value Aq of the control parameter. Assum- 
mg one knows that the likelihood /'''^(o''°|{Zi)) of 
the observations given a realization {Z^} of the 
set of random variables Z^, k - 1 . . .K, are also 
Gaussian f'^ioj\{Zk}) cx expHoj - o\{Zk])\VO^ 
( = cr^g, then the posterior distributions of the 
independent random variables can be used to 
describe the specific context associated with the 
observations o^°, j - 1 ... 7. (They are a pri- 
ori better than the prior guesses and should be- 
come all the better as the number of observa- 



For well-posedness of the PDE we will need to numeri- 
cally truncate the Gaussian realizations by excluding those re- 
alizations where b < 0. This is a poor man's "truncated Gaus- 
sian", but in practice the standard deviations crj are chosen suf- 
ficiently small for most realizations to be in the range of 
admissibility where the surrogate model is valid. 



tions increases.) In particular, the model posterior, 
given by the Bayes formula nf=i 7r^'^'^(Zi|{o^'°)) cc 
nj=i/''^(of|{Z^})nf=i4(Z-i), can be used to 
compute various quantities of interest, like the ex- 
pectation of s'^. Let us use the RB control- variate 
MC method to evaluate fast E.„.u.((s'^\{o'^°]) for var- 
ious values of the hyper parameters ^, ^, various 
values of the control parameter A in the model and 
possibly various sets of observations (at Ao - A 
possibly, but not necessarily). Quite importantly, 
note that here we assumed that we know explicitly 
the likelihood function /''•^(■|{Zt)) by experience. 
This is a specific Bayesian framework which is of- 
ten used, although the exact likelihood function is 
often not known in practice (see Rem.[T]i. 

We use the RB control-variate MC method to 
compute the MMSE of s'^ with the parametrized 
posterior above in different settings for S - .5,K = 
10, = 12. For instance: (i) with 4 x4 x 2^^ = 2'2 
parameter values {k2,E,crfj e [.1, 10] x [.1, 1] X 
[lO"'*, lO""*] and one fixed 7=1 observation Oj" 
generated synthetically by uncertainty propagation 
at Aq - (2,0.5) or (ii) with 10 x 10 parameter val- 
ues [k2,E) e [.1,10] X [.1,1] and 10 values for 
each of the 7 = 3 observations o^°J = 1 ... 7, gen- 
erated synthetically by uncertainty propagation at 
Aq - (2, 0.5). In both cases, the RB control-variate 
technique can bring computational reductions to a 
direct MC method as one can see by confronting 
the numerical results in Fig. [7] with the reasoning 
detailed in Section|3] 

Remark 1. Sometimes we do not know how to 
choose the likelihood function and we need to pre- 
compute it numerically first. Let us illustrate this, 
in another Bayesian context for the sake of simplic- 
ity in the computations. For instance, assume that 
we want to esimate probabilities for k2from suffi- 
ciently many observations of the output s'* whose 
random fiuctuations are generated by those of b. 
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Figure 7: Top: maximum and mean in a sample of MC esti- 
mations V^,(,f.f (i"^ - 2/= I for vaiious training values 
of A, ^, ^ and lo-*"!, as a function of / during the greedy search 
("offline" stage) when S = .5, K = 10, N = 12 in case (ii) 
(case (i) is similar, with a decrease rate in fact a bit faster). Bot- 
tom: maximum, mean and minimum of the same estimators 
for various trial values of A, f , f and {oj° ) (average using many 
realizations) during the "online" stage. 



To estimate k2 from J i.i.d. observations Sj", j = 
1 ... 7, with likelihood /^"(sj^l^j) depending only 
on the law of b at the unknown value k2 = k^, we 
could use a Bayesian approach with a prior distri- 
bution ki ~ n. The posterior distribution 

J 

7^\k2\[sf]) (X n{k2) W f%sf\k2) (4.3) 

allows one to compute quantities that depend on k2 
more accurately than with the prior distribution. 
(At least if the assumptions of the Bayesian model 
are satisfied by the "truth" and if the number of 
observations J is large enough.) But first we need 
to precompute the likelihood f^''{-\k2). 

A "kernel " approach for instance allows one to 
precompute the parametrized Probability Density 
Functional (PDF) from a sample of realizations at 
specific values of the controlled parameter Eq. It 
amounts to give to each realization ofs^" a weight. 
In Fig. [S] we show the approximate likelihood for 
observations at one parameter value Eq, after 
reconstruction of the PDF from a sample of real- 
izations of s^''-'^"^ at various values ofk2, syntheti- 
cally generated with a numerical MC simulation of 
our model and weighted by a truncated Gaussian 
weight with hyperparameters ai, a2, M 

Z K-^-'-\<.f^^ ■ (4.4) 

n7=l 

We numerically checked that the RB control- 
variate MC method was still efficien^^to compute 
many expectations of a scalar functional of the un- 
certain parameter k2 using ( |4.4| l as likelihood in 



" Efficiency is meant with respect to variance reduction. 
Of course k2 is only a scalar in the example here and one 
can argue that most often the MC method is not the best in- 
tegration method then. Our example is nevertheless an illustra- 
tion (admittedly simple) of more general situations with high- 
dimensional k2 when one uses a robust MC-like method. 
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Figure 8: Empirical PDF j4j4) of k2 at fixed Eq for M = \(f 
realizations with truncated Gaussian weight ai = .5,a2 = .01. 

the posterior ( |4.3| l. We used it for many values of 
the control parameter Eq, of the set of observa- 
tions [s^"} and of the hyper parameters ci, aa, M. 
But notice that actually using ( |4.4| l in the poste- 
rior ( |4.3[ ) is very time-consuming, especially with 
large M. Moreover, one needs to interpolate ( |4.4| l 
at the values of Eq where it has not been precom- 
puted. So the computation of probability densi- 
ties, for instance with a kernel approach, remains 
a challenging numerical problem in Bayesian es- 
timation. And the computational reductions pro- 
posed in the present work do not tackle the latter 
point. But note that clearly, to improve the com- 
putation of ( |4.4| l at many parameter values Eq in 
particular, RB ideas could still be useful. Yet this 
is still another topic for another work. 

5. Conclusion and Perspectives 

In this article, we have presented new elements 
of analysis of the RB control-variate MC method 
(error estimates and convergence results) as well 
as new useful applications (to uncertainty quantifi- 
cation of PDEs with stochastic coefficients). But 
a number of questions remains, about theory and 
practice, which may be the object of future works. 

First, our convergence results are very much in- 



spired from those in 1271 l28l about the standard 
RB method for PDEs and the same limitations ap- 
ply: (i) they rely on assumptions that are diffi- 
cult to check in practice (the fast decay of Kol- 
mogorov widths), (ii) they may still be very pes- 
simistic in comparison with practical possibilities 
(the constant in the upper-bound may be far subop- 
timal and a theoretical gain could be seen only for 
uninterestingly large dimensions / of the reduced 
basis), and (iii) they do not help choosing a good 
training sample A of trial parameter values for the 
greedy algorithm (especially in the cases where A 
is large). Besides, (iii) is sometimes not only due 
to a lack of answer to the question how to het the 
optimal reduction, but also to the question how to 
get any reduction, when A is large. Indeed, al- 
though the maximal gain (in the infinitely-many- 
query limit or, equivalently, for real-time purposes) 
reads only like the ratio of the reduced vs. non- 
reduced marginal computational time per parame- 
ter value A, it is Umited in practice by the possi- 
bility to inspect all trial parameter values /I e A 
and should take into account that "offline" part of 
the work. That is why, in absence of theory for 
choosing A when A has a high-cardinality (pos- 
sibly infinite) and high dimension, one still needs 
numerical tests to check the capabilities of the re- 
duction method using specific (heuristic) choices 
of A. Fortunately, we have been able to show nu- 
merically that some gain is possible in practical sit- 
uations. 

In some UQ frameworks, we could actually 
prove numerically that some gains were possible 
by applying the RB control-variate MC method. 
But as any numerical proof, this was achieved on 
one specific model problem, for which some lim- 
itations were also clearly seen. So one would of 
course still need characterizing precisely the lim- 
itations for another specific problem. More pre- 
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cisely, one would need to specify numerically the 
efficiency regime in a given UQ context where 
a parametrized expectation has to be computed 
many time for many values of the parameter, which 
depends on the number of queries in the parameter 
values, but also on the decrease rate of the vari- 
ance with respect to the dimension of the reduced 
basis of control variates and on the required accu- 
racy. Furthermore, this efficiency regime will be 
reachable only for those problems whose "offline" 
work is actually do-able (in a human lifetime) and 
is thus limited in the dimension of the parame- 
ter A by the size of A, as well as in the numer- 
ical complexity of the underlying model for ran- 
dom realizations by the computation time of one 
single realization. That latter limit also sets new 
challenges for future works in UQ: the simulation 
of high-dimensional noise (with high-dimensional 
KL truncation order K) in the case of PDEs with 
stochastic coefficients, and the fast computation 
of empirical likelihoods (say by kernel approach) 
in Bayesian estimation. We already mentionned 
some tracks for future works about these issues in 
the course of this work. 

To conclude, this work is a direct improve- 
ment on iri4ll which we believe to be a useful nu- 
merical approach to some UQ problems, in par- 
ticular because it can bring huge computational 
gains at almost no cost to the simple and widely- 
used "naive" MC method. Notice that in any 
case we can certify the approximation error in all 
steps of the computations (of course in a proba- 
bilistic sense as for the MC method) and there is 
thus no loss of rigor in our approach compared 
with the naive MC method. Moreover, we hope 
that our suggestion, after lf20l . to use RB ideas 
in such fields as Bayesian estimations, which are 
known to have computationally demanding appli- 
cations, will bring some new thoughts in that com- 



munity, who might for instance want to identify 
new "many-query" opportunities of computational 
reductions using greedy algorithms. 

Appendix A. Weak greedy convergence 

An important tool for the proof is Lemma [T| 
which is a straightforward variation of a result 
proved in ||28] 2.2]. 



Lemma 1. If for some /c e N there exists Q < I < 
Ic, q,m e N and e (0, 1) such that 6a j < o-j+g„, 
then it also holds <ti < dm/\9 - q^^. Moreover, if 
the same hypothesis holds for O&i < &i^qm, then it 
holds&i < idm+Oicdi^) l\9—q~^-\with a probability 
more than 1 - e, more precisely in the events where 
it also holds &i - 0,dj < Y(Z'^'*' - 7-^'+') < o"/. 

Let us treat the case in the weak greedy algo- 
rithm where for a > it holds di < d^r" , VO < 
/ < lu- Assume we use the MC estimator M^^" 
whatever e e (0, 1). For / < 7^, the outcome 
((t)o<,</„ of the weak greedy algorithm is as fol- 
lows with probability more than 1 - e. For q e N>o, 
and ji,y ^ M>() to be determined later, we define 
^0 '■- \yq\ > and assume 



VC > 0, 3/,) < / < /m/o-/ > CdoF 



(A.l) 



Given C > 0, we denote the smallest / satisfy- 



ing A.l by Ic- If for < / < /c we can define 
mj e N by / + qmj < Iq < I + q{mj + 1) then, since 
(cr,),- is a decreasing sequence, it holds by (|A.1[), 



cTr < &,,iIclD" < <T/,,„ He I If ■ (A.2) 
Applying Lemma[T]with m-mj and - (I/Ic)" 
<Ti, <&!< d,„,{l + 0i,)/\il/lcr -q''-\ (A.3) 



(observe d,„, > dj^) contradicts ( |A.l| i as soon as 
one chooses C > ilc/mi)"(l+0i^)/\il/lc)"-q''^ \ > 
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0. In particular, if we set / = &Ic with some well- 
chosen e (0, 1 - l/yS - 1/r), then 



unApqm, > I3{lc-l-q) > m-Wc-q) > k- In 
fact, it is always possible to find / e N>o with m/ € 
N>() (mj > Ly/ySJ > in particular) provided one 
takes y > 1 large enough and sufficiently smaller 
/3>1. This requires C > (J3qf(l+ei,)/\&" -q-'2\. 
And finally, it also holds VO < / < k, o"/ < C<io/ " 
provided one chooses C such that 

C > meix{(fiqni+6,,)/\&"-q-'2\,(^o/do)Q > 0, 

which can still be optimized as a function of ^ > 0. 
So the outcome of the weak greedy algorithm sat- 
isfies VO < / < Im, <5"/ < Cdor" with probability 
more than 1 -e. A similar reasoning can be adapted 
from ||28]| for the other cases. 
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